Profiling of Code_Saturne with HPCToolkit and TAU, and autotuning Kernels with Orio

Authors

  • B. Lindi
  • T. Ponweiser
  • P. Jovanovic
  • T. Arslan
Abstract

This study profiles the application Code_Saturne, which is part of the PRACE benchmark suite. The profiling was carried out with HPCToolkit and Tuning and Analysis Utilities (TAU), with the aim of finding compute kernels suitable for autotuning. Autotuning is regarded as a necessary step toward sustainable performance at the Exascale level, as Exascale systems will most likely have a heterogeneous runtime environment. Such an environment imposes a parameter space on an application's run-time behavior that cannot be explored by a traditional compiler. Nor can it be explored manually by the developer or code owner, as this would be too time-consuming. The tool Orio was used to autotune the identified compute kernels on traditional Intel processors, Intel Xeon Phi, and NVIDIA GPUs. The compute kernels make only a small contribution to the overall execution time of Code_Saturne. Autotuning with Orio improved these kernels by 3-5%.

Similar resources

Performance analysis of large scale parallel CFD computing based on Code_Saturne

In order to run computational fluid dynamics (CFD) codes on large scales, parallel computing has to be employed. For instance, on Petascale computing, general parallel computing without any optimization is not enough, especially for complex industrial issues that employ a large number of mesh cells to capture the details of the geometry. How to distribute these mesh cells among the multi-proces...

Application Performance Profiling on the Cray XD1 using HPCToolkit

HPCToolkit is an open-source suite of multi-platform tools for profile-based performance analysis of sequential and parallel applications. The toolkit consists of components for collecting performance measurements of fully-optimized executables without adding instrumentation, analyzing application binaries to understand the structure of optimized code, correlating measurements with program stru...

HPCTOOLKIT: tools for performance analysis of optimized parallel programs

HPCTOOLKIT is an integrated suite of tools that supports measurement, analysis, attribution, and presentation of application performance for both sequential and parallel programs. HPCTOOLKIT can pinpoint and quantify scalability bottlenecks in fully-optimized parallel programs with a measurement overhead of only a few percent. Recently, new capabilities were added to HPCTOOLKIT for collecting c...

Tools for machine-learning-based empirical autotuning and specialization

The process of empirical autotuning results in the generation of many code variants which are tested, found to be suboptimal, and discarded. By retaining annotated performance profiles of each variant tested over the course of many autotuning runs of the same code across different hardware environments and different input datasets, we can apply machine learning algorithms to generate classifier...

Inhibitory effect of corcin on aggregation of 1N/4R human tau protein in vitro

Objective(s): Alzheimer's disease (AD) is the most common age-related neurodegenerative disorder. One of the hallmarks of AD is an abnormal accumulation of fibril forms of tau protein which is known as a microtubule associated protein. In this regard, inhibition of tau aggregation has been documented to be a potent therapeutic approach in AD and tauopathies. Unfortunately, the available syntheti...

Publication date: 2014